feat(storage): Enable full object checksum PR 1/3: parse finalize_time and server crc32c in async object stream #17261
Merged
Code Review
This pull request introduces tracking for object finalization and full object CRC32C checksums in the asynchronous read stream. The feedback focuses on simplifying the production code by removing logic added solely to accommodate unit test mocks (such as checking for a .seconds attribute). Instead, it is recommended to mock finalize_time as a standard datetime.datetime or None in the unit tests, which allows the production code to rely on standard isinstance checks.
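The recommendation above can be illustrated with a minimal sketch. The helper name `is_finalized` and the `SimpleNamespace` stand-ins for mocked metadata are hypothetical and do not come from the PR; the point is only that when tests supply `finalize_time` as a `datetime.datetime` or `None`, a plain `isinstance` check is sufficient and no `.seconds` probing is needed.

```python
import datetime
from types import SimpleNamespace


def is_finalized(metadata) -> bool:
    """Hypothetical helper: treat an object as finalized only when
    finalize_time is a real datetime (unset values are assumed to be None)."""
    finalize_time = getattr(metadata, "finalize_time", None)
    return isinstance(finalize_time, datetime.datetime)


# Test doubles mocked the way the review suggests: a datetime or None.
finalized_metadata = SimpleNamespace(
    finalize_time=datetime.datetime(2026, 5, 27, tzinfo=datetime.timezone.utc)
)
unfinalized_metadata = SimpleNamespace(finalize_time=None)

print(is_finalized(finalized_metadata))
print(is_finalized(unfinalized_metadata))
```

Because the doubles carry realistic values, the production check does not need a branch that exists only for the tests.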
…est-asyncio issues
kalragauri approved these changes on May 27, 2026
sofisl added a commit that referenced this pull request on Jun 11, 2026
PR created by the Librarian CLI to initialize a release. Merging this PR will auto trigger a release.

Librarian Version: v0.19.0
Language Image: us-central1-docker.pkg.dev/cloud-sdk-librarian-prod/images-prod/python-librarian-generator@sha256:234b9d1f2ddb057ed7ac6a38db0bf8163d839c65c6cf88ade52530cddebce59e

<details><summary>gapic-generator: v1.35.0</summary>

## [v1.35.0](gapic-generator-v1.34.1...gapic-generator-v1.35.0) (2026-06-11)

### Features

* setup.py matches prerelease versions (#17370) ([25b857e](25b857e1))

### Bug Fixes

* require protobuf 6.33.5 to address CVE-2026-0994 (#17349) ([6642263](66422636))

</details>

<details><summary>google-auth: v2.54.0</summary>

## [v2.54.0](google-auth-v2.53.0...google-auth-v2.54.0) (2026-06-11)

### Features

* implement regional access boundary support for standalone JWT and async service accounts (#17025) ([35af616](35af6168))

### Bug Fixes

* configure mTLS for impersonated credentials (#17404) ([57269d5](57269d56))
* fail-fast on missing ECP config file to avoid 30s hang (#17377) ([e096127](e0961270))
* Rename the 'seed' argument for setting an initial regional access boundary for clarity (#17186) ([e5c8cf9](e5c8cf92))
* update incorrect urls in setup.py to point at monorepo vs splitrepo (#17237) ([eaed04b](eaed04ba))

</details>

<details><summary>google-cloud-alloydb: v0.11.0</summary>

## [v0.11.0](google-cloud-alloydb-v0.10.0...google-cloud-alloydb-v0.11.0) (2026-06-11)

### Features

* update API sources and regenerate (#17413) ([59fe7cf](59fe7cf8))

</details>

<details><summary>google-cloud-biglake: v0.5.0</summary>

## [v0.5.0](google-cloud-biglake-v0.4.0...google-cloud-biglake-v0.5.0) (2026-06-11)

### Features

* update API sources and regenerate (#17431) ([2e75c78](2e75c78c))

</details>

<details><summary>google-cloud-ces: v0.7.0</summary>

## [v0.7.0](google-cloud-ces-v0.6.0...google-cloud-ces-v0.7.0) (2026-06-11)

### Features

* update API sources and regenerate (#17413) ([59fe7cf](59fe7cf8))

</details>

<details><summary>google-cloud-confidentialcomputing: v0.11.0</summary>

## [v0.11.0](google-cloud-confidentialcomputing-v0.10.0...google-cloud-confidentialcomputing-v0.11.0) (2026-06-11)

### Features

* update API sources and regenerate (#17413) ([59fe7cf](59fe7cf8))

</details>

<details><summary>google-cloud-modelarmor: v0.7.0</summary>

## [v0.7.0](google-cloud-modelarmor-v0.6.0...google-cloud-modelarmor-v0.7.0) (2026-06-11)

### Features

* update API sources and regenerate (#17413) ([59fe7cf](59fe7cf8))

</details>

<details><summary>google-cloud-network-services: v0.10.0</summary>

## [v0.10.0](google-cloud-network-services-v0.9.0...google-cloud-network-services-v0.10.0) (2026-06-11)

### Features

* update API sources and regenerate (#17431) ([2e75c78](2e75c78c))

</details>

<details><summary>google-cloud-oracledatabase: v0.6.0</summary>

## [v0.6.0](google-cloud-oracledatabase-v0.5.0...google-cloud-oracledatabase-v0.6.0) (2026-06-11)

### Features

* update API sources and regenerate (#17413) ([59fe7cf](59fe7cf8))

</details>

<details><summary>google-cloud-spanner: v3.68.0</summary>

## [v3.68.0](google-cloud-spanner-v3.67.0...google-cloud-spanner-v3.68.0) (2026-06-11)

### Features

* add asynchronous code snippets and minor cleanup changes (#17337) ([d6aaf61](d6aaf610))

### Performance Improvements

* optimize query result decoding (#17375) ([3f70b2f](3f70b2ff))

</details>

<details><summary>google-cloud-storage: v3.12.0</summary>

## [v3.12.0](google-cloud-storage-v3.11.0...google-cloud-storage-v3.12.0) (2026-06-11)

### Features

* full object checksum: implement rolling checksum and verification in reads resumption strategy (#17262) ([2361ba6](2361ba6e))
* Enable full object checksum PR 1/3 : parse finalize_time and server crc32c in async object stream (#17261) ([72c7a27](72c7a272))
* full object checksum: integrate full-object checksum in AsyncMultiRangeDownloader (#17263) ([b6a85e4](b6a85e49))

</details>

<details><summary>google-developer-knowledge: v0.1.0</summary>

## [v0.1.0](google-developer-knowledge-v0.0.0...google-developer-knowledge-v0.1.0) (2026-06-11)

### Features

* add google-developer-knowledge (#17417) ([ca02afc](ca02afce))

</details>
1. Overview of the Solution
This solution implements end-to-end full-object checksum validation in `AsyncMultiRangeDownloader` for the asynchronous Google Cloud Storage Python client library. As asynchronous multiplexed downloads of non-contiguous ranges are performed concurrently over a single bidirectional gRPC connection, this feature automatically and incrementally calculates a rolling checksum as bytes arrive and validates it against the server's authoritative object checksum once the download completes.

The technical approach consists of three coordinated layers:
* `_AsyncReadObjectStream` (Stream Ingestion): Safely extracts the authoritative server checksum (`full_obj_server_crc32c`) and finalization status (`is_finalized`) from the object metadata received in the first data payload response of the stream.
* `_ReadResumptionStrategy` & `_DownloadState` (Verification Logic): Computes an isolated, persistent rolling checksum in the individual `_DownloadState` object to ensure calculations do not bleed across concurrent multiplexed ranges. Crucially, the rolling hash updates only after buffer writes succeed to prevent state corruption during retry re-connects, raising a `DataCorruption` exception on completion if a mismatch occurs.
* `AsyncMultiRangeDownloader` (Orchestration & Cleanup): Detects candidate full-object ranges (e.g., `(0, 0)` or `(0, persisted_size)`), propagates checksum settings to the resumption strategy, and guarantees robust cleanup (closing the stream immediately and unregistering IDs) if data corruption or write errors occur.

2. What This PR Specifically Does
This PR implements Step 1: Stream Metadata Ingestion of the solution:
* `_AsyncReadObjectStream`: safely parse GCS object metadata from the first data payload of the response.
* `is_finalized`, `full_obj_server_crc32c`, and `object_metadata` attributes in `_AsyncReadObjectStream.open()`.
* `tests/unit/conftest.py`: resolve compatibility issues with `pytest-asyncio` under Python 3.11+.
* `test_async_read_object_stream.py`: verify that finalization status and server-authoritative checksums are correctly extracted, or skipped for unfinalized objects.
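
The metadata ingestion step described here might look roughly like the sketch below. It is not the PR's implementation: the `StreamState` class, the `ingest_first_response` function, and the response shape (`response.metadata.finalize_time`, `response.metadata.checksums.crc32c`) are assumptions made for illustration, and only the three attribute names are taken from the description. The sketch records the server checksum only when the object is finalized and leaves it unset otherwise.

```python
import datetime
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Optional


@dataclass
class StreamState:
    """Hypothetical holder for the attributes named in the PR description."""
    is_finalized: bool = False
    full_obj_server_crc32c: Optional[int] = None
    object_metadata: Optional[Any] = None


def ingest_first_response(state: StreamState, response: Any) -> StreamState:
    """Copy finalization status and server checksum from the first response.

    The field names read from `response` are assumptions for this sketch.
    """
    metadata = getattr(response, "metadata", None)
    if metadata is None:
        return state  # nothing to ingest; leave defaults untouched

    state.object_metadata = metadata
    finalize_time = getattr(metadata, "finalize_time", None)
    state.is_finalized = isinstance(finalize_time, datetime.datetime)

    # Only a finalized object has an authoritative full-object checksum.
    if state.is_finalized:
        checksums = getattr(metadata, "checksums", None)
        state.full_obj_server_crc32c = getattr(checksums, "crc32c", None)
    return state


# Stand-in responses: one finalized object, one still being written.
finalized_response = SimpleNamespace(
    metadata=SimpleNamespace(
        finalize_time=datetime.datetime(2026, 5, 27, tzinfo=datetime.timezone.utc),
        checksums=SimpleNamespace(crc32c=0xE3069283),
    )
)
unfinalized_response = SimpleNamespace(
    metadata=SimpleNamespace(
        finalize_time=None,
        checksums=SimpleNamespace(crc32c=0xE3069283),
    )
)

finalized_state = ingest_first_response(StreamState(), finalized_response)
unfinalized_state = ingest_first_response(StreamState(), unfinalized_response)
```

Keeping the checksum unset for unfinalized objects gives later verification code a simple signal to skip full-object validation.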